ABSTRACT

The purpose of this paper is to link empirical Bayes 
methods with two specific topics in item response 
theory — item/subtest regression, and testing the goodness of fit of 
the Rasch model — under the assumptions of local independence and 
sufficiency. It is shown that item/subtest regression results in 
empirical Bayes estimates only if the Rasch model holds. 
Additionally, it is shown that a newly-derived exploratory 
goodness-of-fit test for the Rasch model, which does not need item 
and person parameter estimates, can be seen as an empirical Bayes 
test. This test compares the observed proportions of correct answers 
to one specific item, given any pattern that leads to a number-right 
score. These proportions should be about equal. (RLC)
Abstract

Empirical Bayes methods are linked with item response theory
under the assumptions of local independence and sufficiency.
It is shown that item-subtest regression results in empirical 
Bayes estimates if and only if the Rasch model holds. 
Additionally, it will be shown that a newly derived, 
exploratory, Rasch model test can be seen as an empirical 
Bayes test. 

Key-words: Empirical Bayes estimation, Rasch model,
item-subtest regression, sufficiency.
Introduction

The purpose of this paper is to link empirical Bayes methods 
with two specific topics in item response theory (IRT):
item-subtest regression and testing the goodness of fit of the
Rasch model. It will be shown that item-subtest regression 
results in empirical Bayes estimates if and only if the Rasch 
model (Rasch, 1980) holds. Additionally, it will be shown 
that a Rasch model (RM) test can be derived as an empirical 
Bayes test from first principles. 

To briefly introduce empirical Bayes estimation, let us
assume that an observation z of a random variable Z is made,
where the distribution function of Z depends on a parameter
$\theta$, i.e., the distribution function of Z is given by
$F(z \mid \theta)$. In empirical Bayes estimation, just as in
"classical" Bayes estimation, the assumption of a prior
distribution function $G(\theta)$ for the unknown parameter
$\theta$ is essential. In addition, however, "previous" data
are used to obtain a (in some way) reasonable estimator of
$G(\theta)$, and hence of $\theta$. In some cases it is
possible to obtain an empirical Bayes estimator of $\theta$
without explicit estimation of $G(\theta)$. For a more
detailed introduction to empirical Bayes methods, see, for
instance, Robbins (1955, 1964) or Maritz (1970).

Sample item-subtest regressions, which are used by Lord 
(1980) to construct estimators for item-response functions, 
are obtained by computing the proportion of examinees that 
answered one specific item correctly, given their number- 
correct scores on the test minus the item concerned. 
Empirical Bayes Methods in IRT Models

As an introduction to the concept of empirical Bayes
estimation in IRT models, we will first explain how the
concept works in the simplest IRT model, the binomial model.
In this model, all K items have the same item characteristic
curve (ICC), i.e., $\tau_m(\theta) = P(X_m = 1 \mid \theta) = \theta$
for all $m = 1, \ldots, K$, where $\theta$ is the (latent)
ability. So, the total score Y is binomially distributed with
parameters K and $\theta$. The Bayes point estimator
$\delta_G(y)$ of the mean ICC, which is denoted by
$\tau(\theta) = \sum_{m=1}^{K} \tau_m(\theta)/K$, under squared
error loss is given by

(1)  $$\delta_G(y) = \frac{\int \theta\, p_K(y;\theta)\, dG(\theta)}{\int p_K(y;\theta)\, dG(\theta)}$$

where $p_K(y;\theta)$ is the density or discrete probability
function of y given $\theta$, and $G(\theta)$ the prior
distribution function of $\theta$ (Lehmann, 1983). Note that
$\tau(\theta) = \theta$. For the binomial case (1) leads to



(2)  $$\delta_G(y) = \frac{(y+1)\, p_{K+1}(y+1)}{(K+1)\, p_K(y)}$$

where $p_{K+1}(z)$ denotes the probability of obtaining a total
score of z on a (K+1)-item test (Meredith and Kearns, 1973).
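
Formula (2) can be checked numerically. The sketch below assumes a Beta(a, b) prior for $\theta$, which is an illustrative choice and not one made in the paper. Under that assumption the marginal score distribution is beta-binomial and the posterior mean has the closed form (y + a)/(K + a + b), so the two should agree. The function names and the values of K, a and b are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def score_prob(y, k, a, b):
    """Marginal probability of score y on a k-item binomial test
    when theta has a Beta(a, b) prior (the beta-binomial law)."""
    return comb(k, y) * exp(log_beta(y + a, k - y + b) - log_beta(a, b))


def bayes_estimate(y, k, a, b):
    """Right-hand side of formula (2)."""
    return ((y + 1) * score_prob(y + 1, k + 1, a, b)
            / ((k + 1) * score_prob(y, k, a, b)))


def posterior_mean(y, k, a, b):
    """Closed-form posterior mean of theta under a Beta(a, b) prior."""
    return (y + a) / (k + a + b)


K, A, B = 10, 2.0, 3.0  # illustrative values, not taken from the paper
for y in range(K + 1):
    print(y, round(bayes_estimate(y, K, A, B), 6),
          round(posterior_mean(y, K, A, B), 6))
```

The two printed columns should coincide for every score y, which illustrates that (2) expresses the Bayes estimate through marginal score probabilities alone.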
Now suppose that a random sample of examinees of size N
is available. The empirical Bayes estimate of $\theta$ is then
obtained by simply substituting observed frequencies for
theoretical probabilities. Since scores on a (K+1)-item test
cannot be observed, however, (2) must be modified in such a
way that only observable quantities remain. Apart from some
technical details, this is done by computing empirical Bayes
estimates on the subtests in which the m-th (m = 1,...,K) item
has been deleted, and evaluating the mean of these estimates
to get a more stable result. It can be shown that these
estimates are consistent and that the empirical Bayes risk
converges to the Bayes risk. More details can be found in
Cressie (1982), Jannarone (1979), Meredith and Kearns (1973),
Kearns and Meredith (1975) and Robbins (1955).

It should be mentioned that some sort of smoothing of
empirical Bayes estimates may be useful, such as constraining
the estimates to increase monotonically in the total score y
(van Houwelingen, 1977; Jannarone, 1979).
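
One common way to impose such a monotonicity constraint is the pool-adjacent-violators algorithm, which gives the weighted least-squares nondecreasing fit. The sketch below is offered only as an illustration of the idea; it is not claimed to be the smoothing method of the cited authors. The function name, the example estimates, and the use of score-group sizes as weights are assumptions, and the weights are assumed to be positive.

```python
def monotone_smooth(estimates, weights):
    """Weighted pool-adjacent-violators fit.

    estimates[y] is the raw estimate at total score y and weights[y]
    is a positive weight (for instance the number of examinees with
    that score). Returns nondecreasing smoothed estimates.
    """
    blocks = []  # each block: [weighted mean, total weight, scores pooled]
    for value, weight in zip(estimates, weights):
        blocks.append([value, weight, 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            w = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / w, w, n1 + n2])
    smoothed = []
    for value, _, count in blocks:
        smoothed.extend([value] * count)
    return smoothed


raw = [0.10, 0.30, 0.25, 0.60]   # hypothetical raw estimates
sizes = [10, 20, 20, 5]          # hypothetical score-group sizes
print(monotone_smooth(raw, sizes))
```

In this example the two middle estimates violate the ordering and are replaced by their weighted mean, while the others are left unchanged.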

With the binomial model as a guideline, it is now easy
to evaluate empirical Bayes estimates for the ICCs for
general IRT models, without making any assumptions whatsoever
on the form of the ICC (i.e., non-parametrically).

We want to compute the empirical Bayes estimate of the
m-th ICC, $\tau_m(\theta)$. As in the binomial model, this will
be done by using the responses on subtest $S_{(m)}$ containing
all but the m-th item. In contrast with the binomial model,
where all items are equivalent, this estimate has to be
computed for each subtest response vector separately. Hence,
the empirical Bayes estimate will be based on $x_{(m)}$,
denoting the response vector on $S_{(m)}$. In the following we
will use subscripts (m) to denote operations where the m-th
item is excluded.
We will need the following assumption:

(3)  $$P(X_m = 1, X_{(m)} = x_{(m)} \mid \theta) = P(X_m = 1 \mid \theta)\, P(X_{(m)} = x_{(m)} \mid \theta),$$

for all m and all vectors $x_{(m)}$. For that matter, it can be
easily shown that (3) is equivalent to the assumption of
local independence.
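
The equivalence can be sketched as follows. This is one possible argument, written here for illustration and not taken from the paper. LI abbreviates local independence, i.e., $P(X = x \mid \theta) = \prod_{n=1}^{K} P(X_n = x_n \mid \theta)$ for all response vectors x.

```latex
% LI => (3): factor the joint probability and regroup the terms n != m.
% The second step uses LI summed over x_m.
\begin{align*}
P(X_m = 1, X_{(m)} = x_{(m)} \mid \theta)
  &= P(X_m = 1 \mid \theta) \prod_{n \neq m} P(X_n = x_n \mid \theta) \\
  &= P(X_m = 1 \mid \theta)\, P(X_{(m)} = x_{(m)} \mid \theta).
\end{align*}
% (3) => LI: subtracting (3) from P(X_(m) = x_(m) | theta) gives the
% same factorization for X_m = 0, so X_m is independent of X_(m)
% given theta, for every m. Taking m = 1, then summing the statement
% for m = 2 over x_1, and so on, peels off one item at a time:
\begin{align*}
P(X = x \mid \theta)
  &= P(X_1 = x_1 \mid \theta)\,
     P(X_2 = x_2, \ldots, X_K = x_K \mid \theta) \\
  &= P(X_1 = x_1 \mid \theta)\, P(X_2 = x_2 \mid \theta)\,
     P(X_3 = x_3, \ldots, X_K = x_K \mid \theta) \\
  &= \cdots = \prod_{n=1}^{K} P(X_n = x_n \mid \theta).
\end{align*}
```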

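With (3) and the above notations, the Bayes estimate of
$\tau_m(\theta)$ based on $x_{(m)}$ is defined by

(4)  $$\delta_{(m)}(x_{(m)}) = \frac{P(X_m = 1, X_{(m)} = x_{(m)})}{P(X_{(m)} = x_{(m)})}.$$

Rewriting (4) leads to

(5)  $$\delta_{(m)}(x_{(m)}) = \frac{\int \tau_m(\theta)\, P(X_{(m)} = x_{(m)} \mid \theta)\, dG(\theta)}{\int P(X_{(m)} = x_{(m)} \mid \theta)\, dG(\theta)}.$$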

Now suppose that

(6)  $$P(X_{(m)} = x_{(m)} \mid \theta) = q(s(x_{(m)});\theta)\, h(x_{(m)})$$

for some functions q and h and for all $x_{(m)}$, or equivalently
suppose that the factorization criterion for sufficiency holds
on the subtest $S_{(m)}$. Then
(7)  $$\delta_{(m)}(x_{(m)}) = \frac{\int \tau_m(\theta)\, q(s(x_{(m)});\theta)\, dG(\theta)}{\int q(s(x_{(m)});\theta)\, dG(\theta)} = \delta_{(m)}(y_{(m)}),$$

provided that $s(x_{(m)}) = s(y_{(m)})$. Hence, in this case, the
Bayes estimates for $\tau_m(\theta)$ based on two response
patterns $x_{(m)}$ and $y_{(m)}$ are equal if the sufficient
statistics for these patterns are equal. But we have
sufficiency of the item-deleted test score only in the RM and
in the two-parameter logistic model with known (not
necessarily equal) discrimination parameters. So, for the RM,
the Bayes estimate of the m-th ICC can now be based on the
sufficient statistic $R_{(m)} = \sum_{n \neq m} X_n$. This
estimate is easily obtained by substituting $r_{(m)}$ for
$x_{(m)}$ in formula (4), which leads to

(8)  $$\delta_{(m)}(r_{(m)}) = \frac{P(X_m = 1, R_{(m)} = r_{(m)})}{P(R_{(m)} = r_{(m)})},$$

which is the usual item true-score regression (Lord, 1960,
p. 251-252).
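
The role of sufficiency in (7) can be illustrated with an exact computation of formula (5). The sketch below assumes a three-point discrete prior and logistic ICCs with made-up item parameters; none of these values come from the paper, and all names are hypothetical. With equal discriminations (the RM), two subtest patterns with the same number-correct score give the same Bayes estimate. With unequal discriminations they do not.

```python
from math import exp


def icc(theta, difficulty, discrimination=1.0):
    """Logistic ICC; discrimination = 1 gives the Rasch model."""
    return 1.0 / (1.0 + exp(-discrimination * (theta - difficulty)))


def pattern_prob(pattern, theta, difficulties, discriminations):
    """P(pattern | theta) under local independence."""
    prob = 1.0
    for x, b, a in zip(pattern, difficulties, discriminations):
        p = icc(theta, b, a)
        prob *= p if x == 1 else 1.0 - p
    return prob


def bayes_estimate_from_pattern(m, rest_pattern, prior, difficulties,
                                discriminations):
    """Formula (5) for item m, with the prior given as (theta, weight) pairs."""
    rest_b = [b for n, b in enumerate(difficulties) if n != m]
    rest_a = [a for n, a in enumerate(discriminations) if n != m]
    num = den = 0.0
    for theta, weight in prior:
        likelihood = pattern_prob(rest_pattern, theta, rest_b, rest_a)
        num += weight * icc(theta, difficulties[m], discriminations[m]) * likelihood
        den += weight * likelihood
    return num / den


# Illustrative values only.
prior = [(-1.0, 0.3), (0.0, 0.4), (1.0, 0.3)]
difficulties = [-0.5, 0.0, 0.5, 1.0]
rasch = [1.0, 1.0, 1.0, 1.0]
unequal = [1.0, 0.5, 1.0, 2.0]

for label, a in (("Rasch", rasch), ("unequal discriminations", unequal)):
    e1 = bayes_estimate_from_pattern(0, (1, 0, 0), prior, difficulties, a)
    e2 = bayes_estimate_from_pattern(0, (0, 0, 1), prior, difficulties, a)
    print(label, round(e1, 4), round(e2, 4))
```

Both patterns have subtest score 1. Only in the Rasch case does the pattern-specific factor h cancel from numerator and denominator, leaving an estimate that depends on the score alone.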

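The empirical Bayes estimate of $\tau_m(\theta)$ based on
$R_{(m)}$ is now obtained by substituting observed proportions
for theoretical probabilities, i.e.,

(9)  $$\hat{\delta}_{(m)}(r_{(m)}) = \frac{\hat{P}(X_m = 1, R_{(m)} = r_{(m)})}{\hat{P}(R_{(m)} = r_{(m)})},$$

where $\hat{P}$ denotes an observed proportion in the sample.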
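
The sample item-subtest regression is then a ratio of counts, because the sample size cancels between the two observed proportions. A minimal sketch follows, assuming the data are stored as a list of 0/1 response rows; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def item_subtest_regression(responses, m):
    """Proportion answering item m correctly at each item-deleted score.

    responses: list of rows, each a list of 0/1 item scores.
    Returns a dict mapping the score on the remaining items to the
    observed proportion correct on item m among examinees with that score.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in responses:
        rest_score = sum(row) - row[m]   # score with item m deleted
        total[rest_score] += 1
        correct[rest_score] += row[m]
    return {r: correct[r] / total[r] for r in sorted(total)}


# Toy data, for illustration only: six examinees, three items.
data = [[1, 1, 0], [0, 1, 0], [1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
print(item_subtest_regression(data, 0))
```

Scores that no examinee obtained are simply absent from the result, so no estimate is reported for them.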
Note that our results point towards a possible goodness-
of-fit test for the m-th item, given that the RM holds for
the subtest $S_{(m)}$: if the observed proportion of examinees
answering the m-th item correctly for a subtest score is
roughly equal for each pattern that leads to the subtest
score, then the RM should also hold for the m-th item;
otherwise it should not. Such a test would be exploratory, in
a way that resembles the 'splitter' technique proposed by
Stelzl (1979) and also described by Molenaar (1963).
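
The comparison described here can be tabulated directly. The sketch below only groups the observed proportions by subtest pattern within each subtest score. It does not implement a formal test statistic, since none is specified in the text. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def pattern_proportions(responses, m):
    """Observed proportion correct on item m for each subtest pattern.

    Returns {subtest score: {subtest pattern: (proportion, n examinees)}}.
    Within one score, the proportions should be roughly equal across
    patterns if the Rasch model also holds for item m.
    """
    counts = defaultdict(lambda: [0, 0])   # pattern -> [correct, total]
    for row in responses:
        pattern = tuple(x for n, x in enumerate(row) if n != m)
        counts[pattern][0] += row[m]
        counts[pattern][1] += 1
    by_score = defaultdict(dict)
    for pattern, (correct, total) in counts.items():
        by_score[sum(pattern)][pattern] = (correct / total, total)
    return dict(by_score)


# Toy data, for illustration only: six examinees, three items.
data = [[1, 1, 0], [0, 1, 0], [1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
for score, table in sorted(pattern_proportions(data, 0).items()):
    print(score, table)
```

The group sizes are returned alongside the proportions because, in practice, many patterns will be observed only a few times, and their proportions will be unstable.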

Discussion 

It has been shown that an empirical Bayes justification for 
using item-subtest regressions as item response function 
estimates holds only for the Rasch model. Indeed, (4) and
(6) show that item-subtest regression is inappropriate 
otherwise. Similar conclusions may hold regarding the use of 
number-correct scores in other IRT models as well. 

Additionally, an (exploratory) goodness-of-fit test for
the Rasch model, which does not need item and person
parameter estimates, has been indicated. This test compares
the observed proportions of correct answers to one specific 
item, given any pattern that leads to a number-right score 
of, say, R' on the remaining items. These proportions should 
be about equal. 